56 research outputs found

    The Plant Orthology Browser: An Orthology and Gene-Order Visualizer for Plant Comparative Genomics

    Worldwide genome sequencing efforts for plants with medium and large genomes require identification and visualization of orthologous genes, while their syntenic conservation becomes the pinnacle of any comparative and functional genomics study. Using gene models for 20 fully sequenced plant genomes, including model organisms and staple crops such as Aegilops tauschii Coss., Arabidopsis thaliana (L.) Heynh., Brachypodium distachyon (L.) Beauv., turnip (Brassica rapa L.), barley (Hordeum vulgare L.), rice (Oryza sativa L.), sorghum [Sorghum bicolor (L.) Moench], wheat (Triticum aestivum L.), red wild einkorn (Triticum urartu Tumanian ex Gandilyan), and maize (Zea mays L.), we computationally predicted 1,021,611 orthologs using stringent sequence similarity criteria. For each pair of plant species, we determined sets of conserved synteny blocks using strand orientation and physical mapping. Gene ontology (GO) annotations are added for each gene. Plant Orthology Browser (POB) includes three interconnected modules: (i) a gene‐order visualization module implementing an interactive environment for exploration of gene order between any pair of chromosomes in two plant species, (ii) a synteny visualization module providing unique interactive dot plot representations of orthologous genes between a pair of chromosomes in two distinct plant species, and (iii) a search module that interconnects all modules via free‐text search capability with online as‐you‐type suggestions and highlighting that allows exploration of the underlying information without constraint of interface‐dependent search fields. The POB is a web‐based orthology and annotation visualization tool, which currently supports 20 completely sequenced plant species with considerably large genomes and offers intuitive and highly interactive pairwise comparison and visualization of genomic traits via gene orthology

    The microarray manual curation tool (MMCT): A Web server for microarray probe evaluations

    Quality control of probe sequences is a major concern in microarray technology. The presence of poor quality probes has a negative impact on the microarray data analysis process. The Microarray Manual Curation Tool (MMCT) is a web server application that provides computational and visual means to investigate the quality of individual probes for oligo microarrays. The MMCT quality metrics assess the free energy of hybridization and the secondary structure of duplexes formed by selected targets and probes, which are specific to various microarray platforms

    Temporal ordering of substitutions in RNA evolution: uncovering the structural evolution of the human accelerated region 1

    The Human Accelerated Region 1 (HAR1) is the most rapidly evolving region in the human genome. It is part of two overlapping long non-coding RNAs, has a length of only 118 nucleotides and features 18 human specific changes compared to an ancestral sequence that is extremely well conserved across non-human primates. The human HAR1 forms a stable secondary structure that is strikingly different from the one in chimpanzee as well as other closely related species, again emphasizing its human-specific evolutionary history. This suggests that positive selection has acted to stabilize human-specific features in the ensemble of HAR1 secondary structures. To investigate the evolutionary history of the human HAR1 structure, we developed a computational model that evaluates the relative likelihood of evolutionary trajectories as a probabilistic version of a Hamiltonian path problem. The model predicts that the most likely last step in turning the ancestral primate HAR1 into the human HAR1 was exactly the substitution that distinguishes the modern human HAR1 sequence from that of Denisovan, an archaic human, providing independent support for our model. The MutationOrder software is available for download and can be applied to other instances of RNA structure evolution
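
The idea of scoring evolutionary trajectories as a probabilistic Hamiltonian path problem can be illustrated with a minimal sketch. This is not the MutationOrder model: the function names, the Boltzmann-style weighting, and the toy cost function are all assumptions made for illustration. Each ordering of the substitutions is a path that visits every substitution exactly once, and each ordering receives a relative likelihood from the summed cost of its steps.

```python
from itertools import permutations
from math import exp


def order_likelihoods(mutations, step_cost, temperature=1.0):
    """Relative likelihood of every ordering of `mutations`.

    `step_cost(applied, m)` is a hypothetical user-supplied cost of
    introducing mutation `m` when the set `applied` is already present.
    Exhaustive enumeration is only feasible for a handful of mutations.
    """
    weights = {}
    for order in permutations(mutations):
        applied = frozenset()
        total = 0.0
        for m in order:
            total += step_cost(applied, m)
            applied = applied | {m}
        # Assumed Boltzmann-style weight: cheaper paths are more likely.
        weights[order] = exp(-total / temperature)
    z = sum(weights.values())
    return {order: w / z for order, w in weights.items()}


def toy_cost(applied, m):
    # Hypothetical rule: "C" is costly unless "A" and "B" came first.
    if m == "C" and not {"A", "B"} <= applied:
        return 5.0
    return 1.0


likelihoods = order_likelihoods(["A", "B", "C"], toy_cost)
best_order = max(likelihoods, key=likelihoods.get)
```

With this toy cost, the orderings that place "C" last carry most of the probability mass, which mirrors the kind of "most likely last step" statement made in the abstract.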

    Predicting environmental stressor levels with machine learning: a comparison between amplicon sequencing, metagenomics, and total RNA sequencing based on taxonomically assigned data

    Introduction: Microbes are increasingly (re)considered for environmental assessments because they are powerful indicators for the health of ecosystems. The complexity of microbial communities necessitates powerful novel tools to derive conclusions for environmental decision-makers, and machine learning is a promising option in that context. While amplicon sequencing is typically applied to assess microbial communities, metagenomics and total RNA sequencing (herein summarized as omics-based methods) can provide a more holistic picture of microbial biodiversity at sufficient sequencing depths. Despite this advantage, amplicon sequencing and omics-based methods have not yet been compared for taxonomy-based environmental assessments with machine learning. Methods: In this study, we applied 16S and ITS-2 sequencing, metagenomics, and total RNA sequencing to samples from a stream mesocosm experiment that investigated the impacts of two aquatic stressors, insecticide and increased fine sediment deposition, on stream biodiversity. We processed the data using similarity clustering and denoising (only applicable to amplicon sequencing) as well as multiple taxonomic levels, data types, feature selection, and machine learning algorithms and evaluated the stressor prediction performance of each generated model for a total of 1,536 evaluated combinations of taxonomic datasets and data-processing methods. Results: Sequencing and data-processing methods had a substantial impact on stressor prediction. While omics-based methods detected a higher diversity of taxa than amplicon sequencing, 16S sequencing outperformed all other sequencing methods in terms of stressor prediction based on the Matthews Correlation Coefficient. However, even the highest observed performance for 16S sequencing was still only moderate. Omics-based methods performed poorly overall, but this was likely due to insufficient sequencing depth. Data types had no impact on performance while feature selection significantly improved performance for omics-based methods but not for amplicon sequencing. Discussion: We conclude that amplicon sequencing might be a better candidate for machine-learning-based environmental stressor prediction than omics-based methods, but the latter require further research at higher sequencing depths to confirm this conclusion. More sampling could improve stressor prediction performance, and while this was not possible in the context of our study, thousands of sampling sites are monitored for routine environmental assessments, providing an ideal framework to further refine the approach for possible implementation in environmental diagnostics
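
For readers unfamiliar with the Matthews Correlation Coefficient used above to rank models, the following is a minimal sketch of the standard binary formula, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name and the 0/1 label encoding are assumptions for illustration, and the study's own evaluation pipeline is not shown in the abstract.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred):
    """Binary MCC for labels encoded as 0 (negative) and 1 (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical stressor labels: 1 = stressor present, 0 = absent.
observed = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = matthews_corrcoef(observed, predicted)
```

MCC ranges from −1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy.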

    Thermodynamically based DNA strand design

    We describe a new algorithm for design of strand sets, for use in DNA computations or universal microarrays. Our algorithm can design sets that satisfy any of several thermodynamic and combinatorial constraints, which aim to maximize desired hybridizations between strands and their complements, while minimizing undesired cross-hybridizations. To heuristically search for good strand sets, our algorithm uses a conflict-driven stochastic local search approach, which is known to be effective in solving comparable search problems. The PairFold program of Andronescu et al. [M. Andronescu, Z. C. Zhang and A. Condon (2005) J. Mol. Biol., 345, 987–1001; M. Andronescu, R. Aguirre-Hernandez, A. Condon, and H. Hoos (2003) Nucleic Acids Res., 31, 3416–3422.] is used to calculate the minimum free energy of hybridization between two mismatched strands. We describe new thermodynamic measures of the quality of strand sets. With respect to these measures of quality, our algorithm consistently finds, within reasonable time, sets that are significantly better than previously published sets in the literature

    Effects of Clipping of Flight Feathers on Resource Use in Gallus gallus domesticus

    Ground-dwelling species of birds, such as domestic chickens (Gallus gallus domesticus), experience difficulties sustaining flight due to high wing loading. This limited flight ability may be exacerbated by loss of flight feathers that is prevalent among egg-laying chickens. Despite this, chickens housed in aviary style systems need to use flight to access essential resources stacked in vertical tiers. To understand the impact of flight feather loss on chickens’ ability to access elevated resources, we clipped primary and secondary flight feathers for two hen strains (brown-feathered and white-feathered, n = 120), and recorded the time hens spent at elevated resources (feeders, nest-boxes). Results showed that flight feather clipping significantly reduced the percentage of time that hens spent at elevated resources compared to ground resources. When clipping both primary and secondary flight feathers, all hens exhibited greater than or equal to 38% reduction in time spent at elevated resources. When clipping only primary flight feathers, brown-feathered hens saw a greater than 50% reduction in time spent at elevated nest-boxes. Additionally, brown-feathered hens scarcely used the elevated feeder regardless of treatment. Clipping of flight feathers altered the amount of time hens spent at elevated resources, highlighting that distribution and accessibility of resources is an important consideration in commercial housing

    Predicting dry matter intake in Canadian Holstein dairy cattle using milk mid-infrared reflectance spectroscopy and other commonly available predictors via artificial neural networks.

    Dry matter intake (DMI) is a fundamental component of the animal's feed efficiency, but measuring DMI of individual cows is expensive. Mid-infrared reflectance spectroscopy (MIRS) on milk samples could be an inexpensive alternative to predict DMI. The objectives of this study were (1) to assess if milk MIRS data could improve DMI predictions of Canadian Holstein cows using artificial neural networks (ANN); (2) to investigate the ability of different ANN architectures to predict unobserved DMI; and (3) to validate the robustness of developed prediction models. A total of 7,398 milk samples from 509 dairy cows distributed over Canada, Denmark, and the United States were analyzed. Data from Denmark and the United States were used to increase the training data size and variability to improve the generalization of the prediction models over the lactation. For each milk spectra record, the corresponding weekly average DMI (kg/d), test-day milk yield (MY, kg/d), fat yield (FY, g/d), and protein yield (PY, g/d), metabolic body weight (MBW), age at calving, year of calving, season of calving, days in milk, lactation number, country, and herd were available. The weekly average DMI was predicted with various ANN architectures using 7 predictor sets, which were created by different combinations of MY, FY, PY, MBW, and MIRS data. All predictor sets also included age at calving and days in milk. In addition, the classification effects of season of calving, country, and lactation number were included in all models. The explored ANN architectures consisted of 3 training algorithms (Bayesian regularization, Levenberg-Marquardt, and scaled conjugate gradient), 2 types of activation functions (hyperbolic tangent and linear), and from 1 to 10 neurons in hidden layers. In addition, partial least squares regression was also applied to predict the DMI. Models were compared using cross-validation based on leaving out 10% of records (validation A) and leaving out 10% of cows (validation B). Superior fitting statistics of models comprising MIRS information compared with the models fitting milk, fat and protein yields suggest that other unknown milk components may help explain variation in weekly average DMI. For instance, using MY, FY, PY, and MBW as predictor variables produced a predictive accuracy (r) ranging from 0.510 to 0.652 across ANN models and validation sets. Using MIRS together with MY, FY, PY, and MBW as predictors resulted in improved fitting (r = 0.679-0.777). Including MIRS data improved the weekly average DMI prediction of Canadian Holstein cows, but it seems that MIRS predicts DMI mostly through its association with milk production traits and its utility to estimate a measure of feed efficiency that accounts for the level of production, such as residual feed intake, might be limited and needs further investigation. The better predictive ability of nonlinear ANN compared with linear ANN and partial least squares regression indicated possible nonlinear relationships between weekly average DMI and the predictor variables. In general, ANN using Bayesian regularization and scaled conjugate gradient training algorithms yielded slightly better weekly average DMI predictions compared with ANN using the Levenberg-Marquardt training algorithm
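
The difference between the two validation schemes can be sketched as follows. This is an illustrative assumption of how such splits might be built, not the authors' code: the record layout (a dict with a `cow` key), the function names, and the single hold-out draw are all hypothetical. Validation A holds out individual records, so the same cow can appear in both training and test data; validation B holds out whole cows, so every test cow is unseen during training.

```python
import random


def holdout_records(records, frac=0.10, seed=0):
    """Validation A style: hold out a fraction of individual records."""
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    n_test = max(1, round(frac * len(idx)))
    test_idx = set(idx[:n_test])
    train = [r for i, r in enumerate(records) if i not in test_idx]
    test = [r for i, r in enumerate(records) if i in test_idx]
    return train, test


def holdout_cows(records, frac=0.10, seed=0):
    """Validation B style: hold out all records of a fraction of cows."""
    rng = random.Random(seed)
    cows = sorted({r["cow"] for r in records})
    rng.shuffle(cows)
    n_test = max(1, round(frac * len(cows)))
    test_cows = set(cows[:n_test])
    train = [r for r in records if r["cow"] not in test_cows]
    test = [r for r in records if r["cow"] in test_cows]
    return train, test


# Hypothetical data: 20 cows with 5 weekly records each.
records = [{"cow": c, "week": w} for c in range(20) for w in range(5)]
train_a, test_a = holdout_records(records)
train_b, test_b = holdout_cows(records)
```

Because repeated records from one cow are correlated, record-wise hold-out tends to give more optimistic accuracy than cow-wise hold-out, which is why reporting both is informative.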

    Unravelling the genetics of non-random fertilization associated with gametic incompatibility

    In the dairy industry, mate allocation is dependent on the producer’s breeding goals and the parents’ breeding values. The probability of pregnancy differs among sire-dam combinations, and the compatibility of a pair may vary due to the combination of gametic haplotypes. Under the hypothesis that incomplete incompatibility would reduce the odds of fertilization, and complete incompatibility would lead to a non-fertilizing or lethal combination, deviation from Mendelian inheritance expectations would be observed for incompatible pairs. By adding an interaction to a transmission ratio distortion (TRD) model, which detects departure from the Mendelian expectations, genomic regions linked to gametic incompatibility can be identified. This study aimed to determine the genetic background of gametic incompatibility in Holstein cattle. A total of 283,817 genotyped Holstein trios were used in a TRD analysis, resulting in 422 significant regions, which contained 2075 positional genes further investigated for network, overrepresentation, and guilt-by-association analyses. The identified biological pathways were associated with immunology and cellular communication and a total of 16 functional candidate genes were identified. Further investigation of gametic incompatibility will provide opportunities to improve mate allocation for the dairy cattle industry
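
The notion of a departure from Mendelian expectations can be made concrete with a simple stand-in. The sketch below is not the TRD model with interaction described in the abstract; it swaps in an exact two-sided binomial test, and the counts and function names are hypothetical. Under Mendelian inheritance, a heterozygous parent transmits a given allele to about half of its offspring, so a transmission count far from n/2 yields a small p-value.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


def transmission_ratio(k, n):
    """Observed share of offspring that inherited the allele."""
    return k / n


# Hypothetical counts: offspring of heterozygous parents carrying the allele.
p_balanced = binomial_two_sided_p(50, 100)   # matches the 1:1 expectation
p_distorted = binomial_two_sided_p(70, 100)  # strong over-transmission
```

A genome-wide scan would repeat such a test across many regions and require multiple-testing control; modelling sire-by-dam interaction, as the study does, needs a richer model than this single-locus test.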

    InnateDB: facilitating systems-level analyses of the mammalian innate immune response

    Although considerable progress has been made in dissecting the signaling pathways involved in the innate immune response, it is now apparent that this response can no longer be productively thought of in terms of simple linear pathways. InnateDB (www.innatedb.ca) has been developed to facilitate systems-level analyses that will provide better insight into the complex networks of pathways and interactions that govern the innate immune response. InnateDB is a publicly available, manually curated, integrative biology database of the human and mouse molecules, experimentally verified interactions and pathways involved in innate immunity, along with centralized annotation on the broader human and mouse interactomes. To date, more than 3500 innate immunity-relevant interactions have been contextually annotated through the review of 1000 plus publications. Integrated into InnateDB are novel bioinformatics resources, including network visualization software, pathway analysis, orthologous interaction network construction and the ability to overlay user-supplied gene expression data in an intuitively displayed molecular interaction network and pathway context, which will enable biologists without a computational background to explore their data in a more systems-oriented manner